Student Team: YES
Tableau 8.1
Qlikview 11
Access (2010)
Excel (2013)
Python 2.7.3
Wordle.net (http://www.wordle.net/)
RAW (http://raw.densitydesign.org/)
Camtasia Studio 8
Approximately how many hours were spent working on this submission in total?
200 hours
May we post your submission in the Visual Analytics Benchmark Repository after
VAST Challenge 2014 is complete? YES
Video:
https://www.flickr.com/photos/124558678@N06/14587539366/
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Questions
Please note - this challenge contains a question that is time-dependent.
Within 3 hours of starting the final data stream, send an email to
VASTChal2014MC3@vacommunity.org containing your answer to question MC3.1.
Please include a copy of your answer to MC3.1 in your final answer form also.
Your answers to MC3.2 and MC3.3, along with your video, are due July 8.
The responses to these questions should be incrementally built, as you (the
contestant) acquire information from each streaming data segment you
receive. Your submission will answer these questions in consideration of
all of the streaming data segments.
MC3.1 - Within 3 hours after the start of the final data stream, send an email
to VASTChal2014MC3@vacommunity.org containing:
a. An image showing the streaming data in your visual analytics tool. In this
image, identify an event of interest that you intend to investigate further.
b. The content of the final message in the data stream
We used Qlikview to display the time series of streaming records. The number
of messages per second on 23 January 2014 as a function of time is shown as a
red line for Microblog Messages (mbdata) and a blue line for Call Center Data
(ccdata). The number of mbdata and ccdata messages received peaks at 20:10.
At this time, a shooting is taking place and the police are apparently
beginning negotiations with terrorists.
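The message-rate series described above can be approximated outside Qlikview. The following is a minimal Python sketch, not the workflow actually used: it assumes every record is a pipe-delimited line with the message type in the first field and a yyyymmddHHMMSS timestamp in the third field (as in the final mbdata message), that ccdata records follow the same layout, and that no field contains a literal "|". The function names and the default one-minute bin are illustrative choices.

```python
from collections import Counter


def count_per_bin(records, width=12):
    """Count records per (message type, time bin).

    width is the number of timestamp characters kept:
    12 bins by minute (yyyymmddHHMM), 14 bins by second.
    """
    counts = Counter()
    for line in records:
        fields = line.split("|")
        msg_type, timestamp = fields[0], fields[2]  # assumed field positions
        counts[(msg_type, timestamp[:width])] += 1
    return counts


def peak_bin(counts, msg_type):
    """Return the time bin with the most records of the given type."""
    per_bin = {b: n for (t, b), n in counts.items() if t == msg_type}
    return max(per_bin, key=per_bin.get)
```

Plotting the resulting counts per type against time would give a chart of the same kind as the one described, with the peak bin marking a candidate event of interest.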
The content of the last message is:
mbdata|RT @KronosStar There has been an explosion from inside the apartment building. Several people are down. #KronosStar #DancingDolphinFire #AFDHeroes|20140123213200|0|gardener4958|RT @KronosStar There has been an explosion from inside the apartment building. Several people are down. #KronosStar #DancingDolphinFire #AFDHeroes|||
This content was modified using Java to generate something like:
line = Joiner.on("|").useForNull("[N/A]").join(
    Arrays.asList(msgType, content, dateTime, i, author, content,
                  lat, longitude, location));
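For readers without the Java toolchain, the same nine-field, pipe-delimited layout can be handled with a short Python sketch. This is an illustration under assumptions, not the code used for the submission: the field names mirror the Java variables above, content_copy is a hypothetical name for the second occurrence of content, and empty fields are treated as missing so that they are written back as "[N/A]".

```python
# Field order taken from the Java snippet; "content_copy" is a hypothetical
# name for the repeated content field.
FIELDS = ["msg_type", "content", "date_time", "i", "author",
          "content_copy", "lat", "longitude", "location"]


def parse_record(line):
    """Split one pipe-delimited record into a dict; empty fields become None.

    Assumes no field contains a literal "|".
    """
    values = line.split("|")
    if len(values) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(values)))
    return {name: (value if value != "" else None)
            for name, value in zip(FIELDS, values)}


def format_record(record, null="[N/A]"):
    """Join the fields with "|", writing a placeholder for missing values."""
    return "|".join(null if record[name] is None else str(record[name])
                    for name in FIELDS)
```

Applied to the final message, parse_record yields None for lat, longitude and location, and format_record writes those three fields back as "[N/A]".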
MC 3.2 - Describe the timeline of up to
five major events that you discover in the streaming data. This timeline should
include information from all three segments of the data stream if needed.
Use specific microblog records and call center data to support
your description, but do not simply mimic back the data stream. Provide a
concise description of important participants, locations and durations.
Focus your response on the events themselves, rather than on the individuals
reporting the events. Please limit your answer to no more than ten images and
1500 words.
To identify and exclude spam and junk messages, we analyzed the number of
messages by author. For each user, we counted the total number of messages
sent and how many of them contained exactly the same text. To quantify text
repetition, we built a spam index by dividing the number of messages with
distinct content sent by a user by the total number of messages sent by that
user. A low spam index value indicates many messages with the same text and
is thus associated with a high likelihood of spam.
As shown in the bar graph above, KronosQuoth and Clevvah4Evah sent large numbers of messages
(1,265 and 153, respectively), but their spam index is very low. We found that
the contents of these messages were not relevant to the questions asked, thus
we considered them spam or junk and excluded them from further analyses.
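The spam index described above (distinct message texts divided by total messages, per author) can be sketched in a few lines of Python. This is a minimal illustration; the function name and the (author, text) input shape are assumptions, and the submission itself does not state which tool computed the index.

```python
from collections import defaultdict


def spam_index(messages):
    """Compute the spam index per author.

    messages: iterable of (author, text) pairs.
    Returns {author: (total_messages, distinct_texts / total_messages)}.
    An index near 0 means the author mostly repeats the same text.
    """
    texts_by_author = defaultdict(list)
    for author, text in messages:
        texts_by_author[author].append(text)
    return {author: (len(texts), len(set(texts)) / len(texts))
            for author, texts in texts_by_author.items()}
```

Authors can then be ranked by message count and index, and those with many messages and a very low index reviewed by hand before exclusion, as was done for the two accounts above.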
Timeline of Events
We detected four distinct events by examining the pattern of repeating
hashtags within short periods:
· Rally of the Protectors of Kronos (POK) in Abila City Park;
· Fire at the Dancing Dolphin apartment complex;
· Black van hit and run;
· Shooting at Gelato Galore ice cream parlor.
Event 1. The first event took place on 23 January 2014 starting at 17:00.
This event was a rally in Abila City Park organized by the Protectors of
Kronos (POK). The park is bounded by Pilau St. on the west, Parla St. on
the east, Achilleos St. and Ermou St. on the north, and Egeou St. on the
south.
The Protectors of Kronos (POK) is a
political activist movement that was started in 1997 as a small group of seven
citizens concerned about contamination from drilling at the Tiskele Bend gas
fields in Kronos. One of its charismatic leaders was Elian Karel, who died on
19 June 2009 – apparently from a heart attack – after being held in prison for
three months.
The rally hostess is Sylvia Marek, one of the leaders of POK. She is also the
leader and co-founder of Save Our Wildlands (SOW), a small environmental
activist group associated with POK. Special guests at the rally include
(a) Dr. Audrey McConnel Newman, an internationally renowned environmental
scientist from the United States, (b) Lucio Jakab, co-founder of SOW with
Sylvia Marek, and (c) Professor Lorenzo Di Stefano, who teaches Environmental
Science at the University of Abila. The band Victor-E is playing at the
rally. A timeline of events occurring during the rally is shown in the table
below. For all four events, the developments and their times were extracted
from the microblog and call center data.
Event 2. The second event started at 18:25 and involved a fire at the Dancing
Dolphin apartment complex, located at the corner of N Achilleos St. and
N Madeg St. A timeline of events occurring during the fire is shown in the
table below.
Event 3. The third event involves two consecutive hit and run incidents
starting at 19:19. The driver of a black van first hits a car and then a
cyclist. A timeline of developments in this event is shown in the table
below.
Event 4. The fourth event is a shooting, starting at 19:39, in the parking
lot of “Gelato Galore”, an ice cream parlor at the corner of
N Alexandrias St. and N Ithakis St., near Abila City Park. A timeline of
events occurring during the shooting is shown in the table below.
The next graph shows the starting time and
approximate duration of the four detected events. Data considered: hashtags
from mbdata and all ccdata. Hashtags that were considered unrelated to the four
events are not displayed.
We used the hashtags in the messages to identify each event's starting time
and approximate ending time. The first event, the POK Rally, starts at 17:00
and ends around 20:15. Hashtags such as POKRally, Rally, POK, and
POKrallyinthepark are detected.
The second event, the fire at the Dancing Dolphin apartment complex, starts
around 18:40. Microblog messages with hashtags such as Fire,
DancingDolphinFire, AFD (Abila Fire Department), and AFDheroes are found.
The third event, the black van hit and run, starts at 19:20. It is identified
from ccdata entries such as “All Units Broadcast Felony Hit and Run – in
progress” and from the hashtags Jerkdrivers and APD (Abila Police
Department).
The last event, the shooting at Gelato Galore, starts at 19:40; the hashtags
Troubleatgelato, TAG, Gelatogalorestandoff, Blackvan and Hostage are
identified during this event.
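The start and approximate end of each hashtag's use, as discussed above, can be derived directly from the message stream. The sketch below is one possible way to do it in Python, under the assumption that each message is available as a (timestamp, text) pair with a yyyymmddHHMMSS timestamp; the function name is illustrative, and hashtags are lower-cased so that spelling variants in capitalization are merged.

```python
import re
from datetime import datetime

HASHTAG = re.compile(r"#(\w+)")


def hashtag_spans(messages):
    """Return {hashtag: (first_seen, last_seen)} as datetime objects.

    messages: iterable of (timestamp, text), timestamp as 'yyyymmddHHMMSS'.
    """
    spans = {}
    for stamp, text in messages:
        when = datetime.strptime(stamp, "%Y%m%d%H%M%S")
        for tag in HASHTAG.findall(text):
            key = tag.lower()
            first, last = spans.get(key, (when, when))
            spans[key] = (min(first, when), max(last, when))
    return spans
```

A hashtag that appears only once has identical first and last times, so in practice a minimum count per hashtag would help separate event hashtags from one-off noise.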
The stream graph below complements the previous visualization.
For a better visualization we grouped hashtags and ccdata that are related.
The groups are defined as follows:
· Abila: Abila, Abilacitypark, Abilafinest, Abilajobs, Abilaparadise,
Abilaprays, Abilafinest, Abilawatcher
· AFD: AFD, AFDheroes, AFDheros
· APD: APD
· Fire: DancingolphinFire, Dancingdolfinfire, Dancingdolphin,
Dancingdolphinsfire, Dancingdophinsfire, Dancingfire, Dansingdolfinfire,
Dansingdolphinfire
· Gelato: Troubleatgelato, TAG, Standoffover, Standoff, Shooting, GG,
Gelatogalorestandofff, Gelatogalore
· News Media: HI, KronosStar, AbilaPost, IntNews, NewsOnline, CentralBulletin
· POK Rally: POKRally, Rally, POKRallyinthepark, POKliesinthepark, Park,
Parkcheck, Rallypark
· Van Accident: Jerkdrivers, Pursuit Continues, Pursuit, Suspicious Occupied
Vehicle-Black Van, Vehicle Accident-Report
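The grouping above amounts to a lookup table from each label to its group, followed by a count per group and time interval. The following Python sketch shows the idea with three of the groups listed above; the 10-minute interval, the function name and the (timestamp, label) input shape are assumptions made for illustration, not details taken from the submission.

```python
from collections import Counter

# Subset of the groups listed above; the remaining groups follow the same shape.
GROUPS = {
    "AFD": ["AFD", "AFDheroes", "AFDheros"],
    "APD": ["APD"],
    "Van Accident": ["Jerkdrivers", "Pursuit Continues", "Pursuit"],
}

# Inverted lookup: lower-cased label -> group name.
LABEL_TO_GROUP = {label.lower(): group
                  for group, labels in GROUPS.items()
                  for label in labels}


def group_counts(tagged, minutes=10):
    """Count labels per (group, interval start).

    tagged: iterable of (timestamp 'yyyymmddHHMMSS', hashtag or ccdata label).
    Labels that belong to no group are skipped.
    """
    counts = Counter()
    for stamp, label in tagged:
        group = LABEL_TO_GROUP.get(label.lower())
        if group is None:
            continue
        hour, minute = int(stamp[8:10]), int(stamp[10:12])
        interval = "%02d:%02d" % (hour, minute - minute % minutes)
        counts[(group, interval)] += 1
    return counts
```

The counts per group and interval are the kind of input a stream graph needs: one layer per group, with thickness given by the count in each interval.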
The graph shows that the hashtags related to the POK Rally event are used
between 17:00 and 19:15. Hashtags related to Abila and News Media are used
throughout. The APD hashtag has periods during which it is not used. The
hashtags grouped in Fire and AFD start around 18:40. The entries grouped in
Van Accident are used between 19:10 and 19:40, and the hashtags grouped in
Gelato are used between 19:30 and 21:00.
The following visualization is a mood analysis graph that displays
subjectivity (grey line) and the polarity of feelings (green and red bars).
The grey peaks show high subjectivity in the messages. The highest peak
occurs when the shooting at Gelato Galore starts. The other peaks occur
during the speeches at the rally, when the fire starts at the Dancing Dolphin
apartments, and during the hit and run incident.
Overall, the messages show positive feelings. The strongest negative feelings
appear when the fire and the shooting start, consistent with the fear
expressed in the messages.
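The submission does not state how subjectivity and polarity were computed. As an illustration only, the sketch below shows a simple lexicon-based scoring in Python: each known word carries a polarity in [-1, 1] and a subjectivity in [0, 1], and a message is scored by averaging over the words found. The LEXICON entries are hypothetical toy values, not the lexicon behind the graph.

```python
import re

# Hypothetical toy lexicon: word -> (polarity, subjectivity).
LEXICON = {
    "great": (0.8, 0.75),
    "love": (0.5, 0.6),
    "scared": (-0.6, 0.9),
    "terrible": (-1.0, 1.0),
}


def score(text):
    """Return (polarity, subjectivity) averaged over lexicon words in text.

    Messages with no lexicon words score (0.0, 0.0), i.e. neutral and objective.
    """
    words = re.findall(r"[a-z']+", text.lower())
    hits = [LEXICON[w] for w in words if w in LEXICON]
    if not hits:
        return 0.0, 0.0
    polarity = sum(p for p, _ in hits) / len(hits)
    subjectivity = sum(s for _, s in hits) / len(hits)
    return polarity, subjectivity
```

Averaging these scores over all messages in each time interval would yield one subjectivity value and one polarity value per interval, which is the shape of data a line-and-bar mood graph displays.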
MC 3.3 – Select one of your five major
events from question MC 3.2 that you consider to be most likely to provide
additional clues to the investigation of the GASTech
disappearances. Describe the roles of the participants.
Describe how other events you identified in MC3.2 may have influenced your
selected event. Provide a hypothesis and evidence as to whom you suspect as being
directly involved in the GAStech disappearances, either as perpetrators or
victims. Please limit your response to no more than five images and 500
words.
We consider the shooting at Gelato Galore to be the event most likely to
provide additional clues to the investigation of the GASTech disappearances.
For this reason, we assessed the messages that were sent between 19:30 and
21:32.
To identify the participants in this event, we prepared a tree map by author;
the visualization displays the importance of each author's participation.
The authors who sent the most messages during the event are:
We decided to build a “word cloud” based on text from messages sent during
the event by these 10 authors.
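The two steps above (ranking authors by message count within the event window, then collecting word frequencies for the top authors) can be sketched in Python as follows. The submission used a tree map and Wordle.net for the visuals; this sketch only illustrates the counting that feeds them, and the function names, the (timestamp, author, text) input shape and the optional stopword set are assumptions.

```python
import re
from collections import Counter


def top_authors(messages, start, end, n=10):
    """Return the n authors with the most messages in [start, end].

    messages: list of (timestamp 'yyyymmddHHMMSS', author, text).
    Timestamps in this fixed-width format compare correctly as strings.
    """
    counts = Counter(author for stamp, author, _ in messages
                     if start <= stamp <= end)
    return [author for author, _ in counts.most_common(n)]


def word_frequencies(messages, authors, start, end, stopwords=frozenset()):
    """Count words in messages sent by the given authors within the window."""
    wanted = set(authors)
    freq = Counter()
    for stamp, author, text in messages:
        if author in wanted and start <= stamp <= end:
            for word in re.findall(r"[a-z']+", text.lower()):
                if word not in stopwords:
                    freq[word] += 1
    return freq
```

For the window used here, start would be "20140123193000" and end "20140123213200"; the resulting frequencies are what a word cloud renders as font size.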
We can detect more participants based on the previous image:
Police: They block the suspicious black van at Gelato Galore. They engage in
a firefight with the van’s driver, and a policeman is wounded. They hold out
under fire until the SWAT team arrives. When SWAT arrives, they start
evacuating “Carly’s Coffee” and “General Grocer”.
SWAT team: They arrive at the scene at Gelato Galore. They negotiate with
the terrorists, and the hostages are finally released when the terrorists
surrender.
Terrorist: He shoots a policeman and seems to be out of his mind. He shouts
during the negotiations and threatens to kill a hostage.
Hostages: Nobody can see them. Two female hostages were rescued safely, but
no names were released.
The shooting event is related to the van hit and run event because the same
black van is involved. After hitting a car and running over a cyclist, the
van driver flees the scene. Chased by the police, he enters a parking lot
with no exit. Feeling trapped, he starts shooting at the police officers
blocking his way.
The following radar graph shows the connection between the two events.
We conclude that two of the missing GASTech employees, both women, were the
ones inside the black van. They were rescued because their captors hit a car
and ran over a cyclist, and that event set off the police chase of the black
van.
We consider the author with the username Officia1AbilaPost a suspect, first
because this name can be mistaken for an “Official Abila Post”, with the
digit 1 substituted for the letter l, and second because this author sends
messages with misinformation regarding the missing employees, trying to
divert attention. We found this author because s/he used the hashtag
#GASTech.
Considering information from Challenge 1, we think that one of the hostages
is Rachel Pantanal, Executive Assistant to the GASTech CIO, and that one of
the kidnappers is Isia Vann, the brother of Juliana Vann.
We arrived at this conclusion when we searched the microblog messages to find
out whether anyone mentioned a GASTech employee. One message said that its
author had not heard from Rachel for several days. In Challenge 1 we also
found emails from Isia Vann that we consider to be harassment of Rachel.